A New Baseline Estimation Method Applied to Arabic Word Recognition
Abstract
In this paper we analyse the impact of different baseline identification approaches on single-word recognition. We show that classical approaches based on horizontal projection histograms may fail to detect the baseline of short words accurately, which affects the whole processing chain and induces recognition errors. Based on this observation, we propose a novel approach that uses stochastic models to propose probable baseline regions from character features. Once the most probable baseline region is detected, we fine-tune the position of the baseline with a horizontal projection histogram. We ran our experiments on a printed word recognition task using the APTI database and observed a significant increase in performance.
Keywords: HMM; GMM; Arabic recognition; baseline
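The fine-tuning step described in the abstract can be illustrated with a minimal sketch. It assumes a binary word image (1 = ink) and assumes that a candidate row range has already been supplied by some upstream model; the stochastic (HMM/GMM) region detector itself is not implemented here. The function names `horizontal_projection` and `refine_baseline`, and the toy image, are hypothetical and are not taken from the paper.

```python
import numpy as np

def horizontal_projection(img):
    """Count ink pixels in each row of a binary image (1 = ink)."""
    return img.sum(axis=1)

def refine_baseline(img, region=None):
    """Return the row with the most ink.

    If `region` is a (lo, hi) pair, only rows lo..hi-1 are searched.
    The region is assumed to come from an external model (hypothetical).
    """
    profile = horizontal_projection(img)
    lo, hi = region if region is not None else (0, img.shape[0])
    return lo + int(np.argmax(profile[lo:hi]))

# Toy "short word": a long stroke near the top outweighs the true baseline row.
img = np.zeros((10, 12), dtype=int)
img[2, 1:10] = 1   # upper horizontal stroke, 9 ink pixels
img[3:7, 5] = 1    # vertical stroke joining the two
img[7, 2:9] = 1    # baseline row, 7 ink pixels

global_guess = refine_baseline(img)               # whole-image histogram
local_guess = refine_baseline(img, region=(5, 9)) # restricted to candidate region
```

On this toy image the whole-image histogram picks row 2, while restricting the search to the candidate region picks row 7, which is the kind of failure on short words that the abstract attributes to histogram-only approaches.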
Similar resources
Component-based Segmentation of Words from Handwritten Arabic Text
Efficient preprocessing is essential for automatic recognition of handwritten documents. In this paper, techniques for segmenting words in handwritten Arabic text are presented. Firstly, connected components (CCs) are extracted, and the distances between different components are analyzed. The statistical distribution of this distance is then obtained to determine an optimal threshold for words se...
End-Shape Analysis for Automatic Segmentation of Arabic Handwritten Texts
Word segmentation is an important task for many methods related to document understanding, especially word spotting and word recognition. Several word segmentation approaches have been proposed for Latin-based languages, while a few of them have been introduced for...
Performance of hidden Markov model and dynamic Bayesian network classifiers on handwritten Arabic word recognition
This paper presents a comparative study of two machine learning techniques for recognizing handwritten Arabic words, in which hidden Markov models (HMMs) and dynamic Bayesian networks (DBNs) were evaluated. The proposed work is divided into three stages: preprocessing, feature extraction and classification. Preprocessing includes baseline estimation and normalization as well as segmentation...
Rescoring N-Best Hypotheses for Arabic Speech Recognition: A Syntax-Mining Approach
Improving speech recognition accuracy through linguistic knowledge is a major research area in automatic speech recognition systems. In this paper, we present a syntax-mining approach to rescore N-Best hypotheses for Arabic speech recognition systems. The method depends on a machine learning tool (WEKA-3-6-5) to extract the N-Best syntactic rules of the Baseline tagged transcription corpus whic...
Efficient System for Speech Recognition using General Regression Neural Network
In this paper we present an efficient system for speaker-independent speech recognition based on a neural network approach. The proposed architecture comprises two phases: a preprocessing phase, which consists of segmental normalization and feature extraction, and a classification phase, which uses neural networks based on nonparametric density estimation, namely the general regression neural networ...